-
Recently, graph neural network (GNN)-based algorithms were proposed to solve a variety of combinatorial optimization problems [M. J. Schuetz, J. K. Brubaker, H. G. Katzgraber, Nat. Mach. Intell. 4, 367–377 (2022)]. The GNN was tested in particular on randomly generated instances of these problems. The publication [M. J. Schuetz, J. K. Brubaker, H. G. Katzgraber, Nat. Mach. Intell. 4, 367–377 (2022)] stirred a debate about whether the GNN-based method was adequately benchmarked against the best prior methods. In particular, critical commentaries [M. C. Angelini, F. Ricci-Tersenghi, Nat. Mach. Intell. 5, 29–31 (2023)] and [S. Boettcher, Nat. Mach. Intell. 5, 24–25 (2023)] point out that a simple greedy algorithm performs better than the GNN. We do not intend to discuss the merits of arguments and counterarguments in these papers. Rather, in this note, we establish a fundamental limitation for running GNN on random instances considered in these references, for a broad range of choices of GNN architecture. Specifically, these barriers hold when the depth of the GNN does not scale with graph size (we note that depth 2 was used in experiments in [M. J. Schuetz, J. K. Brubaker, H. G. Katzgraber, Nat. Mach. Intell. 4, 367–377 (2022)]), and importantly, these barriers hold regardless of any other parameters of the GNN architecture. These limitations arise from the presence of the overlap gap property (OGP) phase transition, which is a barrier for many algorithms, including, importantly, local algorithms, of which GNN is an example. At the same time, some algorithms known prior to the introduction of GNN provide the best results for these problems up to the OGP phase transition. This leaves very little space for GNN to outperform the known algorithms, and based on this, we side with the conclusions made in [M. C. Angelini, F. Ricci-Tersenghi, Nat. Mach. Intell. 5, 29–31 (2023)] and [S. Boettcher, Nat. Mach. Intell. 5, 24–25 (2023)].
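For context, the greedy baseline invoked in the commentaries can be summarized by a short sketch. The version below (minimum-remaining-degree greedy for independent set on a sparse Erdős–Rényi graph) is an illustrative assumption of the general idea, not the exact code benchmarked in the cited papers.

```python
# Minimal greedy independent-set baseline on a sparse Erdos-Renyi graph.
# Illustrative sketch only; the exact greedy variant used in the commentaries may differ.
import random
from collections import defaultdict

def erdos_renyi(n, p, seed=0):
    rng = random.Random(seed)
    adj = defaultdict(set)
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def greedy_independent_set(n, adj):
    alive = set(range(n))
    independent = []
    while alive:
        # pick a vertex of lowest degree among the remaining vertices
        v = min(alive, key=lambda u: len(adj[u] & alive))
        independent.append(v)
        # remove it and its neighbors from further consideration
        alive -= adj[v] | {v}
    return independent

n = 500
adj = erdos_renyi(n, 3 / n)  # average degree ~3
print(len(greedy_independent_set(n, adj)))
```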
-
We consider the problem of finding a two-layer neural network with sigmoid, rectified linear unit (ReLU), or binary step activation functions that "fits" a training data set as accurately as possible, as quantified by the training error, and study the following question: does a low training error guarantee that the norm of the output layer (outer norm) itself is small? We answer this question affirmatively for the case of non-negative output weights. Using a simple covering number argument, we establish that, under quite mild distributional assumptions on the input/label pairs, any such network achieving a small training error on polynomially many data points necessarily has a well-controlled outer norm. Notably, our results (a) have a polynomial (in d) sample complexity, (b) are independent of the number of hidden units (which can potentially be very high), (c) are oblivious to the training algorithm, and (d) require quite mild assumptions on the data (in particular, the input vector X ∈ ℝ^d need not have independent coordinates). We then leverage our bounds to establish generalization guarantees for such networks through the fat-shattering dimension, a scale-sensitive measure of the complexity class that the network architectures we investigate belong to. Notably, our generalization bounds also have good sample complexity (polynomial in d with low degree), and are in fact near-linear for some important cases of interest.
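For concreteness, the quantities discussed above can be written down directly. The network, data, and weights in the sketch below are hypothetical stand-ins, not the paper's setup; it only illustrates what "training error" and "outer norm" refer to for a two-layer ReLU network with non-negative output weights.

```python
# Two-layer ReLU network, its training error, and its outer norm (norm of the output layer).
# Toy data and random weights; purely illustrative, not the paper's construction.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 20, 1000, 500            # input dimension, hidden units, sample size

X = rng.standard_normal((n, d))    # toy input/label pairs
y = np.sign(X[:, 0])

W = rng.standard_normal((m, d))    # hidden-layer weights w_1, ..., w_m
a = np.abs(rng.standard_normal(m)) / m   # non-negative output weights a_1, ..., a_m

def predict(X):
    # f(x) = sum_j a_j * ReLU(<w_j, x>)
    return np.maximum(X @ W.T, 0.0) @ a

train_error = np.mean((predict(X) - y) ** 2)   # training error
outer_norm = np.sum(np.abs(a))                 # outer norm: norm of the output layer
print(train_error, outer_norm)
```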
-
We consider the problem of finding nearly optimal solutions of optimization problems with random objective functions. Such problems arise widely in the theory of random graphs, theoretical computer science, and statistical physics. Two concrete problems we consider are (a) optimizing the Hamiltonian of a spherical or Ising p-spin glass model, and (b) finding a large independent set in a sparse Erdős–Rényi graph. Two families of algorithms are considered: (a) low-degree polynomials of the input, a general framework that captures methods such as approximate message passing and local algorithms on sparse graphs, among others; and (b) the Langevin dynamics algorithm, a canonical Monte Carlo analogue of the gradient descent algorithm (applicable only to the spherical p-spin glass Hamiltonian). We show that neither family of algorithms can produce nearly optimal solutions with high probability. Our proof uses the fact that both models are known to exhibit a variant of the overlap gap property (OGP) of near-optimal solutions. Specifically, for both models, every two solutions whose objective values are above a certain threshold are either close to or far from each other. The crux of our proof is the stability of both algorithms: a small perturbation of the input induces a small perturbation of the output. By an interpolation argument, such a stable algorithm cannot overcome the OGP barrier. The stability of the Langevin dynamics is an immediate consequence of the well-posedness of stochastic differential equations. The stability of low-degree polynomials is established using concepts from Gaussian and Boolean Fourier analysis, including noise sensitivity, hypercontractivity, and total influence.
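To fix notation, the spherical p-spin Hamiltonian referred to above is H(σ) = n^{-(p-1)/2} Σ J_{i1...ip} σ_{i1}···σ_{ip} with i.i.d. Gaussian couplings and σ constrained to the sphere of radius √n. The sketch below, a rough illustration under those assumptions (the step size, projection, and noise scaling are choices made here, not the paper's), evaluates the Hamiltonian for p = 3 and runs a few discretized Langevin steps projected back to the sphere.

```python
# Spherical p-spin Hamiltonian (p = 3) and a crude projected Langevin ascent.
# Normalizations, step size, and projection are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 3
J = rng.standard_normal((n, n, n))          # i.i.d. Gaussian couplings J_{ijk}

def hamiltonian(sigma):
    # H(sigma) = n^{-(p-1)/2} * sum_{i,j,k} J_{ijk} sigma_i sigma_j sigma_k
    return np.einsum('ijk,i,j,k->', J, sigma, sigma, sigma) / n ** ((p - 1) / 2)

def grad(sigma):
    # gradient of H, summing over the three tensor slots
    g = (np.einsum('ijk,j,k->i', J, sigma, sigma)
         + np.einsum('ijk,i,k->j', J, sigma, sigma)
         + np.einsum('ijk,i,j->k', J, sigma, sigma))
    return g / n ** ((p - 1) / 2)

sigma = rng.standard_normal(n)
sigma *= np.sqrt(n) / np.linalg.norm(sigma)  # start on the sphere of radius sqrt(n)

eta, beta = 1e-3, 1.0
for _ in range(100):
    # discretized Langevin dynamics: drift along the gradient plus Gaussian noise,
    # then project back onto the sphere
    sigma += eta * grad(sigma) + np.sqrt(2 * eta / beta) * rng.standard_normal(n)
    sigma *= np.sqrt(n) / np.linalg.norm(sigma)

print(hamiltonian(sigma) / n)                # energy per spin
```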
-
Consider algorithms with unbounded computation time that probe the entries of the adjacency matrix of an n-vertex graph, and need to output a clique. We show that if the input graph is drawn at random from G(n, 1/2) (and hence is likely to have a clique of size roughly 2 log_2 n), then for every δ < 2 and constant ℓ, there is an α < 2 (that may depend on δ and ℓ) such that no algorithm that makes n^δ probes in ℓ rounds is likely (over the choice of the random graph) to output a clique of size larger than α log_2 n.
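For intuition on the scales involved: in G(n, 1/2) the largest clique has size about 2 log_2 n, while a simple greedy procedure, such as the standard baseline sketched below (not an algorithm from the paper), typically finds a clique of size only about log_2 n.

```python
# Greedy clique search in G(n, 1/2): typically finds a clique of size ~ log2(n),
# roughly half of the ~ 2*log2(n) maximum. Standard baseline, purely illustrative.
import math
import random

def random_graph(n, seed=0):
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < 0.5:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def greedy_clique(n, adj):
    clique = []
    candidates = set(range(n))
    while candidates:
        v = min(candidates)       # any selection rule works; take the smallest remaining vertex
        clique.append(v)
        candidates &= adj[v]      # keep only common neighbors of the clique built so far
    return clique

n = 1000
adj = random_graph(n)
print(len(greedy_clique(n, adj)), 2 * math.log2(n))
```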